A Model-based Line Detection Algorithm in Documents
Abstract
In this paper we present a novel model-based approach to detecting severely broken parallel lines in noisy textual documents. These lines must be detected and removed before the text can be segmented and recognized. We use Directional Single-Connected Chain, a vectorization-based algorithm, to extract the line segments. We then instantiate a parallel line model with three parameters: the skew angle, the vertical line gap, and the vertical translation. A coarse-to-fine approach is used to improve the estimation accuracy. The model lets us incorporate high-level contextual information to enhance detection results even when lines are severely broken. In experiments on our database of 168 noisy Arabic document images, the method detects 94% of the lines.
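The three-parameter parallel line model can be illustrated with a short Python sketch. This is not the authors' implementation: the abstract does not give the model equations or the details of the coarse-to-fine procedure, so the code assumes that line k satisfies y = x * tan(skew) + offset + k * gap, and it substitutes a plain shrinking grid search for the coarse-to-fine estimation. To stay short it refines only the skew angle and treats the gap and offset as known. All function names (synthetic_midpoints, intercepts, model_cost, coarse_to_fine_skew) are hypothetical.

```python
import math


def synthetic_midpoints(skew_deg, gap, offset, n_lines):
    """Generate midpoints of broken segments lying on ideal parallel lines (test data only)."""
    slope = math.tan(math.radians(skew_deg))
    return [(float(x), slope * x + offset + k * gap)
            for k in range(n_lines)
            for x in range(0, 1000, 100)]


def intercepts(points, skew_deg):
    """Project each midpoint along the assumed skew direction onto the y-axis."""
    slope = math.tan(math.radians(skew_deg))
    return [y - slope * x for x, y in points]


def model_cost(points, skew_deg, gap, offset):
    """Sum of squared vertical distances from each midpoint to its nearest model line."""
    cost = 0.0
    for b in intercepts(points, skew_deg):
        k = round((b - offset) / gap)          # index of the nearest model line
        cost += (b - (offset + k * gap)) ** 2
    return cost


def coarse_to_fine_skew(points, gap, offset, lo=-5.0, hi=5.0, steps=11, rounds=4):
    """Estimate the skew angle by repeatedly shrinking a search grid around the best candidate.

    Assumption: gap and offset are already known; a full estimator would refine them too.
    """
    best = lo
    for _ in range(rounds):
        step = (hi - lo) / (steps - 1)
        candidates = [lo + i * step for i in range(steps)]
        best = min(candidates, key=lambda a: model_cost(points, a, gap, offset))
        lo, hi = best - step, best + step      # zoom in around the current best angle
    return best


if __name__ == "__main__":
    pts = synthetic_midpoints(skew_deg=1.7, gap=40.0, offset=10.0, n_lines=5)
    print(coarse_to_fine_skew(pts, gap=40.0, offset=10.0))
```

Because every segment is scored against its nearest model line, a segment that belongs to a severely broken line still contributes to the same global fit, which is the sense in which a shared model can carry contextual information across gaps.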
Similar resources
An Improved Flower Pollination Algorithm with AdaBoost Algorithm for Feature Selection in Text Documents Classification
In recent years, production of text documents has seen an exponential growth, which is the reason why their proper classification seems necessary for better access. One of the main problems of classifying text documents is working in high-dimensional feature space. Feature Selection (FS) is one of the ways to reduce the number of text attributes. So, working with a great bulk of the feature spa...
Sequencing Mixed Model Assembly Line Problem to Minimize Line Stoppages Cost by a Modified Simulated Annealing Algorithm Based on Cloud Theory
This research presents a new application of the cloud theory-based simulated annealing algorithm to mixed model assembly line sequencing problems in which line stoppage cost is to be minimized. This objective is highly significant in mixed model assembly line sequencing problems based on a just-in-time production system. Moreover, this type of problem is NP-hard and solving this probl...
A Novel Hybrid Approach for Email Spam Detection based on Scatter Search Algorithm and K-Nearest Neighbors
Because cyberspace and the Internet dominate users' lives, they bring not only business opportunities and time savings but also hardware and software threats such as information theft and penetration into systems. Security is the top priority in preventing cyber-attacks, and users should first detect the type of attack, because virtual environments are not moni...
Neural-Network-Aided On-line Diagnosis of Broken Bars in Induction Motors
This paper presents a method based on neural networks to detect broken rotor bars and end rings in squirrel cage induction motors. In the first part, detection methods are reviewed, and traditional methods of fault detection as well as the dynamic model of induction motors are introduced using the winding function method. In this method, all stator and rotor bars are considered independently in ord...
A Comprehensive Algorithm for Fault Location in Double-Circuit and Multi-Terminal (More Than Three Terminals) Transmission Lines Based on PMU Data
A new PMU-based fault detection/location algorithm for multi-terminal transmission lines is proposed in this paper. It works on the basis of synchronized voltage and current phasors received from PMUs installed at various terminals. The Clarke transform (for transposed transmission lines) and eigenvalue and eigenvector theory (for untransposed ones) are used to decouple 3-phase differential ...
Publication date: 2003